Use the Detection Transformer as a Data Augmenter
Detection Transformer (DETR) is a Transformer-based object detection model.
In this paper, we demonstrate that it can also be used as a data augmenter.
We term our approach DETR-assisted CutMix, or DeMix for short. DeMix builds on
CutMix, a simple yet highly effective data augmentation
technique that has gained popularity in recent years. CutMix improves model
performance by cutting and pasting a patch from one image onto another,
yielding a new image. The corresponding label for this new example is specified
as the weighted average of the original labels, where the weight is
proportional to the area of the patches. CutMix selects a random patch to be
cut. In contrast, DeMix deliberately selects a semantically rich patch, located
by a pre-trained DETR. The label of the new image is specified in the same way
as in CutMix. Experimental results on benchmark datasets for image
classification demonstrate that DeMix significantly outperforms prior data
augmentation methods, including CutMix.
Comment: 13 pages
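The cut-and-paste step and the area-weighted label described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `cutmix` and `random_box`, the box format, and the toy images are assumptions, and the DETR step is not run. A box produced by a detector would simply be passed in place of the random one.

```python
import numpy as np

def cutmix(img_a, label_a, img_b, label_b, box):
    """Paste the region `box` of img_b onto img_a and mix the one-hot labels.

    box = (top, left, height, width) in pixels. In CutMix the box is random;
    in DeMix it would come from a pre-trained DETR (not run in this sketch).
    """
    top, left, h, w = box
    mixed = img_a.copy()
    mixed[top:top + h, left:left + w] = img_b[top:top + h, left:left + w]
    # Weight of image B's label = fraction of the image area the patch covers.
    lam = (h * w) / (img_a.shape[0] * img_a.shape[1])
    label = (1.0 - lam) * label_a + lam * label_b
    return mixed, label

def random_box(height, width, rng):
    """Simplified random box: uniform size, then uniform valid position."""
    h = int(rng.integers(1, height + 1))
    w = int(rng.integers(1, width + 1))
    top = int(rng.integers(0, height - h + 1))
    left = int(rng.integers(0, width - w + 1))
    return top, left, h, w

# Toy example: an all-zeros image (class 0) and an all-ones image (class 1).
img_a = np.zeros((8, 8, 3))
img_b = np.ones((8, 8, 3))
label_a = np.array([1.0, 0.0])
label_b = np.array([0.0, 1.0])
mixed, label = cutmix(img_a, label_a, img_b, label_b, box=(2, 2, 4, 4))
print(label)  # the 4x4 patch covers 16/64 of the image
```

Keeping the box as an explicit argument separates patch selection from mixing, so a random sampler and a detector-based selector can share the same mixing code.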
Learning Discriminative Bayesian Networks from High-dimensional Continuous Neuroimaging Data
Due to their causal semantics, Bayesian networks (BNs) have been widely employed
to discover the underlying data relationships in exploratory studies, such as
brain research. Despite their success in modeling the probability distribution of
variables, a BN is naturally a generative model, which is not necessarily
discriminative. As a result, subtle but critical network changes that are of
investigative value across populations may be overlooked. In this paper, we
propose to improve the discriminative power of BN models for continuous
variables from two different perspectives. This brings two general
discriminative learning frameworks for Gaussian Bayesian networks (GBN). In the
first framework, we employ the Fisher kernel to bridge the generative models of GBN
and the discriminative classifiers of SVMs, and convert the GBN parameter
learning to Fisher kernel learning via minimizing a generalization error bound
of SVMs. In the second framework, we employ the max-margin criterion and build
it directly upon GBN models to explicitly optimize the classification
performance of the GBNs. The advantages and disadvantages of the two frameworks
are discussed and experimentally compared. Both demonstrate strong
power in learning discriminative parameters of GBNs for neuroimaging-based
brain network analysis, while maintaining reasonable representation
capacity. The contributions of this paper also include a new Directed Acyclic
Graph (DAG) constraint with a theoretical guarantee to ensure the graph validity
of GBN.
Comment: 16 pages and 5 figures for the article (excluding appendix)
Adversarial Feature Stacking for Accurate and Robust Predictions
Deep Neural Networks (DNNs) have achieved remarkable performance on a variety
of applications but are extremely vulnerable to adversarial perturbation. To
address this issue, various defense methods have been proposed to enhance model
robustness. Unfortunately, the most representative and promising methods, such
as adversarial training and its variants, usually degrade model accuracy on
benign samples, limiting practical utility. This indicates that it is difficult
to extract both robust and accurate features using a single network under
certain conditions, such as limited training data, resulting in a trade-off
between accuracy and robustness. To tackle this problem, we propose an
Adversarial Feature Stacking (AFS) model that can jointly take advantage of
features with varied levels of robustness and accuracy, thus significantly
alleviating the aforementioned trade-off. Specifically, we adopt multiple
networks adversarially trained with different perturbation budgets to extract
either more robust features or more accurate features. These features are then
fused by a learnable merger to give final predictions. We evaluate the AFS
model on the CIFAR-10 and CIFAR-100 datasets with strong adaptive attack methods,
where it significantly advances the state of the art in terms of the trade-off.
Without extra training data, the AFS model achieves a benign accuracy
improvement of 6% on CIFAR-10 and 9% on CIFAR-100 with comparable or even
stronger robustness than the state-of-the-art adversarial training methods.
This work demonstrates the feasibility of obtaining both accurate and robust
models under the circumstances of limited training data.
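The stacking structure described above (several feature extractors, one learnable merger) can be sketched as a forward pass. This is only an illustration under stated assumptions: the random linear maps stand in for frozen, adversarially trained networks, the merger is taken to be a single linear layer, and the perturbation budgets and all names are hypothetical. No training or attack is performed.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_extractor(in_dim, feat_dim):
    """Stand-in for a frozen, adversarially trained backbone (hypothetical)."""
    w = rng.normal(size=(in_dim, feat_dim))
    return lambda x: np.maximum(x @ w, 0.0)  # linear map followed by ReLU

def afs_forward(x, extractors, merger_w, merger_b):
    """Concatenate features from every extractor, then apply the merger."""
    stacked = np.concatenate([f(x) for f in extractors], axis=1)
    logits = stacked @ merger_w + merger_b  # merger assumed linear here
    return stacked, logits

in_dim, feat_dim, n_classes = 16, 8, 10
# One extractor per perturbation budget; these values are illustrative only
# and merely determine how many extractors are created in this sketch.
budgets = [0.0, 2 / 255, 8 / 255]
extractors = [make_extractor(in_dim, feat_dim) for _ in budgets]
merger_w = rng.normal(size=(feat_dim * len(budgets), n_classes)) * 0.1
merger_b = np.zeros(n_classes)

x = rng.normal(size=(4, in_dim))  # a batch of 4 toy inputs
stacked, logits = afs_forward(x, extractors, merger_w, merger_b)
print(stacked.shape, logits.shape)
```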
ProtoDiv: Prototype-guided Division of Consistent Pseudo-bags for Whole-slide Image Classification
Because Whole-Slide Image (WSI) samples are scarce and only weakly labeled,
pseudo-bag-based multiple instance learning (MIL) is a promising approach
to WSI classification. However, the pseudo-bag division
scheme, often crucial for classification performance, remains an open
problem. This paper therefore proposes a novel scheme, ProtoDiv, which uses
a bag prototype to guide the division of WSI pseudo-bags. Rather than designing
a complex network architecture, this scheme takes a plug-and-play approach to
safely augment WSI data for effective training while preserving sample
consistency. Furthermore, we devise an attention-based prototype that
can be optimized dynamically during training to adapt to a classification task.
We apply our ProtoDiv scheme to seven baseline models and carry out a
group of comparison experiments on two public WSI datasets. The experiments confirm
that ProtoDiv usually brings clear performance improvements to WSI
classification.
Comment: 12 pages, 5 figures, and 3 tables
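One way a prototype could guide the division of a bag into consistent pseudo-bags is sketched below. The abstract does not spell out the division rule, so everything here is an assumption: the prototype is taken to be the mean instance feature (the paper uses a learned attention-based prototype), similarity is cosine similarity, and instances are dealt round-robin in similarity order so that each pseudo-bag covers the full similarity range. The function name `divide_pseudo_bags` is hypothetical.

```python
import numpy as np

def divide_pseudo_bags(instances, n_bags, prototype=None):
    """Split one bag (n_instances x dim) into n_bags arrays of instance indices."""
    if prototype is None:
        prototype = instances.mean(axis=0)  # assumption: mean as the prototype
    # Cosine similarity of each instance to the prototype.
    norms = np.linalg.norm(instances, axis=1) * np.linalg.norm(prototype)
    sims = instances @ prototype / np.maximum(norms, 1e-12)
    order = np.argsort(-sims)  # most prototype-like instances first
    # Deal the sorted instances round-robin so every pseudo-bag spans the
    # whole range from prototype-like to dissimilar instances.
    return [order[k::n_bags] for k in range(n_bags)]

rng = np.random.default_rng(0)
bag = rng.normal(size=(10, 4))  # toy bag: 10 instances with 4-dim features
pseudo_bags = divide_pseudo_bags(bag, n_bags=3)
print([len(p) for p in pseudo_bags])
```

Each pseudo-bag would inherit the label of its parent bag, which is why a division that keeps the pseudo-bags similar in composition matters.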
Catalytic Asymmetric Reactions between Alkenes and Aldehydes
This doctoral work describes catalytic asymmetric reactions between alkenes and
aldehydes, enabled by the development of chiral Brønsted acids. Valuable and
functionalized enantiomerically enriched cyclic compounds were efficiently furnished
from inexpensive and commercially available reagents with high degrees of atom
economy.
In the first part of this thesis, the first highly enantioselective organocatalytic
intramolecular carbonyl−ene cyclization of olefinic aldehydes is presented. In the second
part, asymmetric cyclizations via oxocarbenium ions are described. One is a general
asymmetric catalytic Prins cyclization of aldehydes with homoallylic alcohols, in which
the oxocarbenium ion is attacked intramolecularly by a pendent alkene. The other one is
an asymmetric oxa-Pictet−Spengler reaction between aldehydes and homobenzyl
alcohols, in which the oxocarbenium ion is trapped by an intramolecular arene. The first
general asymmetric [4+2]-cycloaddition of simple and unactivated dienes with aldehydes
is developed in the last part of this thesis. This methodology is extremely robust and
scalable. Valuable enantiomerically enriched dihydropyran compounds could be readily
obtained from inexpensive and abundant dienes and aldehydes.
New types of confined Brønsted acids were rationally designed and synthesized,
including imino-imidodiphosphates (iIDPs), nitrated imidodiphosphates (nIDPs), and
imidodiphosphorimidates (IDPis). Beyond the application of these catalysts in various
asymmetric reactions between simple alkenes and aldehydes, mechanistic investigations
are also disclosed in this doctoral work.
The Determinants of Going Concern Audit Opinions: Evidence from Shanghai Stock Exchange over 2009 to 2011
This research examines the factors that lead auditors to issue going concern audit opinions in the Chinese stock market and, through the issuance of these opinions, discusses audit quality in the Chinese reporting system. First, the results show that Chinese auditors are more likely to issue going concern audit opinions to listed companies with poor profitability, low liquidity, lower cash inflows, and high leverage. Second, the audit fee paid by a client has no statistically significant effect on whether auditors issue going concern audit opinions. Finally, no significant difference is found between Big Four and non-Big Four audit firms in their propensity to issue going concern audit opinions. Overall, the research suggests that a mature audit profession is forming in developing countries like China, that audit independence is not compromised by economic dependence, and that the majority of Chinese auditors issue going concern opinions mainly on the basis of financial distress indicators. In addition, this paper discusses improvements in audit quality and independence since the new audit standard for going concern was promulgated in 2003.
Key Words: going concern audit opinions, risk of bankruptcy, fee independence, size of audit firms, improvements on audit quality and independence
Q2ATransformer: Improving Medical VQA via an Answer Querying Decoder
Medical Visual Question Answering (VQA) systems play a supporting role in
understanding clinically relevant information carried by medical images. Questions
about a medical image fall into two categories: close-end (such as Yes/No questions)
and open-end. To obtain answers, the majority of existing medical VQA
methods rely on classification approaches, while a few works attempt to use
generation approaches or a mixture of the two. The classification approaches
are relatively simple but perform poorly on long open-end questions. To bridge
this gap, in this paper, we propose a new Transformer-based framework for
medical VQA (named Q2ATransformer), which integrates the advantages of both
the classification and the generation approaches and provides a unified
treatment for the close-end and open-end questions. Specifically, we introduce
an additional Transformer decoder with a set of learnable candidate answer
embeddings to query the existence of each answer class for a given
image-question pair. Through the Transformer attention, the candidate answer
embeddings interact with the fused features of the image-question pair to make
the decision. In this way, despite being a classification-based approach, our
method provides a mechanism to interact with the answer information for
prediction, like the generation-based approaches. At the same time, by
framing the task as classification, we reduce its difficulty by shrinking the search space of
answers. Our method achieves new state-of-the-art performance on two medical
VQA benchmarks. In particular, for the open-end questions, we achieve 79.19% on
VQA-RAD and 54.85% on PathVQA, with 16.09% and 41.45% absolute improvements,
respectively.
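The answer-querying idea described above, in which learnable candidate answer embeddings attend to the fused image-question features, can be sketched with a single cross-attention step. This is a simplified illustration, not the Q2ATransformer architecture: learned projections, multiple heads, normalization, and feed-forward layers are omitted, the scoring head `w_out` and the sigmoid output are assumptions, and all tensors are random placeholders.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # subtract max for stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def answer_query(answer_emb, fused, w_out):
    """Each candidate answer embedding queries the fused features.

    answer_emb: (n_answers, d) candidate answer embeddings (the queries)
    fused:      (n_tokens, d)  fused image-question features (keys/values)
    w_out:      (d,)           shared scoring vector (hypothetical head)
    """
    d = answer_emb.shape[1]
    attn = softmax(answer_emb @ fused.T / np.sqrt(d))  # (n_answers, n_tokens)
    attended = attn @ fused                            # (n_answers, d)
    logits = attended @ w_out                          # one score per answer
    probs = 1.0 / (1.0 + np.exp(-logits))              # existence probability
    return attn, probs

rng = np.random.default_rng(0)
n_answers, n_tokens, d = 5, 7, 16
answer_emb = rng.normal(size=(n_answers, d))
fused = rng.normal(size=(n_tokens, d))
w_out = rng.normal(size=d)
attn, probs = answer_query(answer_emb, fused, w_out)
print(int(np.argmax(probs)))  # index of the highest-scoring candidate answer
```

Because every candidate answer gets its own query, the output is a score per answer class, which keeps the prediction a classification over a fixed answer set while still letting answer information interact with the image-question features.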